AITopics | bayesian logistic regression

Collaborating Authors

bayesian logistic regression

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

SLANG: Fast Structured Covariance Approximations for Bayesian Deep Learning with Natural Gradient

Aaron Mishkin, Frederik Kunstner, Didrik Nielsen, Mark Schmidt, Mohammad Emtiyaz Khan

Neural Information Processing SystemsFeb-14-2026, 17:23:51 GMT

Uncertainty estimation in large deep-learning models is a computationally challenging task, where it is difficult to form even a Gaussian approximation to the posteriordistribution.

approximation, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.15)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.05)
North America > Canada > Quebec > Montreal (0.04)
Europe > Switzerland > Vaud > Lausanne (0.04)

Industry: Health & Medicine (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)

Add feedback

Appendices

Neural Information Processing SystemsFeb-11-2026, 20:01:33 GMT

Additionally, to avoid gradients with infinite means even if DL is not contractive, we consider a spectral normalisation, so that instead of computing recursively η0 = ε and ηk = DLηk 1 for k {1,...,N},weset η0 =εand The motivation was to have a quadratic increase for the penalty term if the largest absolute eigenvalue approaches 1, and then smoothly switch to a linear function for values larger than δ2. The suggested approach can perform poorly for non-convex potentials or even convex potentials such as arsing in a logistic regression model for some data sets. The idea now is to run HMC with unit mass matrix for the transformed variables z = f 1(q) where q π. Hessian-vector products can similarly be computed using vector-Jacobian products: With g(z) = grad( U,z), we then compute 2 U(z)w = vjp(g,z,w)> for z = f 1(stop grad(f(zbL/2c)). We also stop all U gradients, i.e.

acceptance rate, artificial intelligence, machine learning, (12 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.56)

Add feedback

Online sampling from log-concave distributions

Holden Lee, Oren Mangoubi, Nisheeth Vishnoi

Neural Information Processing SystemsFeb-11-2026, 19:35:53 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, gradient, logistic regression, (16 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Sweden > Stockholm > Stockholm (0.04)
Asia > Middle East > Jordan (0.04)
North America > Canada (0.04)

Genre: Research Report > New Finding (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)

Add feedback

SLANG: Fast Structured Covariance Approximations for Bayesian Deep Learning with Natural Gradient

Aaron Mishkin, Frederik Kunstner, Didrik Nielsen, Mark Schmidt, Mohammad Emtiyaz Khan

Neural Information Processing SystemsNov-20-2025, 20:18:14 GMT

To address this issue, we propose a new stochastic, low-rank, approximate natural-gradient (SLANG) method for variational inference in large, deep models. Our method estimates a "diagonal

approximation, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > Canada > British Columbia (0.04)
Europe > Switzerland > Vaud > Lausanne (0.04)
(2 more...)

Genre: Research Report (0.31)

Industry: Health & Medicine (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Myopic Bayesian Decision Theory for Batch Active Learning with Partial Batch Label Sampling

Hu, Kangping, Mussmann, Stephen

arXiv.org Machine LearningOct-14-2025

Over the past couple of decades, many active learning acquisition functions have been proposed, leaving practitioners with an unclear choice of which to use. Bayesian Decision Theory (BDT) offers a universal principle to guide decision-making. In this work, we derive BDT for (Bayesian) active learning in the myopic framework, where we imagine we only have one more point to label. This derivation leads to effective algorithms such as Expected Error Reduction (EER), Expected Predictive Information Gain (EPIG), and other algorithms that appear in the literature. Furthermore, we show that BAIT (active learning based on V-optimal experimental design) can be derived from BDT and asymptotic approximations. A key challenge of such methods is the difficult scaling to large batch sizes, leading to either computational challenges (BatchBALD) or dramatic performance drops (top-$B$ selection). Here, using a particular formulation of the decision process, we derive Partial Batch Label Sampling (ParBaLS) for the EPIG algorithm. We show experimentally for several datasets that ParBaLS EPIG gives superior performance for a fixed budget and Bayesian Logistic Regression on Neural Embeddings. Our code is available at https://github.com/ADDAPT-ML/ParBaLS.

artificial intelligence, bayesian inference, machine learning, (14 more...)

arXiv.org Machine Learning

2510.09877

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Online sampling from log-concave distributions

Holden Lee, Oren Mangoubi, Nisheeth Vishnoi

Neural Information Processing SystemsOct-2-2025, 11:17:53 GMT

In this paper, we study the following online sampling problem: Problem 1.1.

artificial intelligence, bayesian inference, machine learning, (18 more...)

Neural Information Processing Systems

Country: Europe (0.46)

Genre: Research Report > New Finding (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)

Add feedback

Appendices A Gradient terms for the adaptation scheme

Neural Information Processing SystemsAug-18-2025, 17:29:49 GMT

A.1 Gradients for the entropy approximation Following the arguments in [13], we can compute the gradient of the term in (13) using θ A.2 Gradients for the penalty function We used the following penalty function h( x) = ( x δ) A.3 Gradients for the energy error We can write the energy error as (q We generalise the arguments from [14], Lemma 7. Proceeding by induction over n, we have for the case n = 1, for any v R The suggested approach can perform poorly for non-convex potentials or even convex potentials such as arsing in a logistic regression model for some data sets. We illustrate here how to learn a reasonable proposal for a general potential function by considering some version of position-dependent preconditioning. The transformation f as well as U generally depend on some parameters θ that we again omit for a less convoluted notation. Our approach can be seen as an alternative for instance to [31] where such a transformation is first learned by trying to approximate π with a standard Gaussian density using variational inference, while the HMC hyperparameters are adapted in a second step using Bayesian optimisation. The motivation for stopping the gradients comes from considering the special case f: z null Cz that corresponds to the position-independent preconditioning scheme above.

artificial intelligence, effective sample size, machine learning, (12 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.56)

Add feedback

c61aed648da48aa3893fb3eaadd88a7f-Supplemental.pdf

Neural Information Processing SystemsAug-17-2025, 07:22:51 GMT

experiment, iteration iteration iteration iteration, theorem 2, (10 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.71)

Add feedback

Hotel Booking Cancellation Prediction Using Applied Bayesian Models

Jishan, Md Asifuzzaman, Singh, Vikas, Ghosh, Ayan Kumar, Alam, Md Shahabub, Mahmud, Khan Raqib, Paul, Bijan

arXiv.org Artificial IntelligenceOct-23-2024

This study applies Bayesian models to predict hotel booking cancellations, a key challenge affecting resource allocation, revenue, and customer satisfaction in the hospitality industry. Using a Kaggle dataset with 36,285 observations and 17 features, Bayesian Logistic Regression and Beta-Binomial models were implemented. The logistic model, applied to 12 features and 5,000 randomly selected observations, outperformed the Beta-Binomial model in predictive accuracy. Key predictors included the number of adults, children, stay duration, lead time, car parking space, room type, and special requests. Model evaluation using Leave-One-Out Cross-Validation (LOO-CV) confirmed strong alignment between observed and predicted outcomes, demonstrating the model's robustness. Special requests and parking availability were found to be the strongest predictors of cancellation. This Bayesian approach provides a valuable tool for improving booking management and operational efficiency in the hotel industry.

artificial intelligence, bayesian inference, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2410.16406

Country:

North America > United States > Nebraska > Lancaster County > Lincoln (0.14)
Europe > Germany > Brandenburg > Potsdam (0.05)
North America > United States > Louisiana > Lincoln Parish > Ruston (0.04)
(3 more...)

Genre: Research Report > New Finding (0.92)

Industry:

Consumer Products & Services > Hotels (1.00)
Transportation > Ground > Road (0.55)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Quasi-Newton Methods for Markov Chain Monte Carlo

Neural Information Processing SystemsMar-15-2024, 13:06:54 GMT

The performance of Markov chain Monte Carlo methods is often sensitive to the scaling and correlations between the random variables of interest. An important source of information about the local correlation and scale is given by the Hessian matrix of the target distribution, but this is often either computationally expensive or infeasible. In this paper we propose MCMC samplers that make use of quasi-Newton approximations, which approximate the Hessian of the target distribution from previous samples and gradients generated by the sampler. A key issue is that MCMC samplers that depend on the history of previous states are in general not valid. We address this problem by using limited memory quasi-Newton methods, which depend only on a fixed window of previous samples. On several real world datasets, we show that the quasi-Newton sampler is more effective than standard Hamiltonian Monte Carlo at a fraction of the cost of MCMC methods that require higher-order derivatives.

algorithm, approximation, sampler, (15 more...)

Neural Information Processing Systems

Country: